HITS' Graph-based System at the NTCIR-9 Cross-lingual Link Discovery Task

Authors

  • Angela Fahrni
  • Vivi Nastase
  • Michael Strube
Abstract

This paper presents HITS’ system for the NTCIR-9 cross-lingual link discovery task. We solve the task in three stages: (1) anchor identification and ambiguity reduction, (2) graph-based disambiguation combining different relatedness measures as edge weights for a maximum edge weighted clique algorithm, and (3) supervised relevance ranking. In the file-to-file evaluation with Wikipedia ground-truth the HITS system is the top-performer across all measures and subtasks (English-2-Chinese, English-2-Japanese and English-2-Korean). In the file-2-file and anchor-2-file evaluation with manual assessment, the system outperforms all other systems on the English-2-Japanese subtask and is one of the top-three performing systems for the two other subtasks.
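
The disambiguation stage in (2) can be illustrated with a small sketch. This is not the authors' implementation: it assumes each anchor has a short list of candidate concepts, that pairwise relatedness scores are already available, and that the best joint interpretation is the one maximizing the sum of edge weights among the chosen concepts (one per anchor). The function name `best_clique`, the anchors, the concepts, and all scores are hypothetical, and the search is plain brute-force enumeration, which is only practical for tiny inputs.

```python
from itertools import combinations, product


def best_clique(candidates, relatedness):
    """Pick one concept per anchor so that the summed pairwise
    relatedness (the clique's total edge weight) is maximal.

    candidates:  dict mapping anchor text -> list of candidate concepts
    relatedness: dict mapping frozenset({a, b}) -> edge weight
                 (simplifying assumption: a missing pair counts as 0.0)
    """
    anchors = sorted(candidates)
    best_score, best_assignment = float("-inf"), None
    # Enumerate every way of choosing one candidate per anchor.
    for choice in product(*(candidates[a] for a in anchors)):
        score = sum(
            relatedness.get(frozenset(pair), 0.0)
            for pair in combinations(choice, 2)
        )
        if score > best_score:
            best_score, best_assignment = score, dict(zip(anchors, choice))
    return best_assignment, best_score


# Toy data, invented for illustration only.
candidates = {
    "jaguar": ["Jaguar (animal)", "Jaguar Cars"],
    "engine": ["Engine", "Search engine"],
    "speed": ["Speed"],
}
relatedness = {
    frozenset({"Jaguar Cars", "Engine"}): 0.9,
    frozenset({"Jaguar Cars", "Speed"}): 0.6,
    frozenset({"Engine", "Speed"}): 0.7,
    frozenset({"Jaguar (animal)", "Speed"}): 0.5,
    frozenset({"Jaguar (animal)", "Engine"}): 0.1,
    frozenset({"Search engine", "Speed"}): 0.3,
}

assignment, score = best_clique(candidates, relatedness)
print(assignment, round(score, 2))
```

In this toy input the car-related reading wins because its three concepts are mutually more related than any alternative combination. A real system would need a scalable search strategy and learned or corpus-derived relatedness measures; neither is specified in the abstract.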

Similar resources

Multi-filtering Method Based Cross-lingual Link Discovery

This paper describes the cross-lingual link discovery method of ISTIC used in the system evaluation task at NTCIR-9. In this year's evaluation, we participated in the cross-lingual link discovery task from English to Chinese. In this paper, we mainly describe our understanding of CLLD, the key techniques of our system, and the evaluation results.

Overview of the NTCIR-10 Cross-Lingual Link Discovery Task

This paper presents an overview of the NTCIR-10 Cross-lingual Link Discovery (CrossLink-2) task. For the task, we continued using the evaluation framework developed for the NTCIR-9 CrossLink-1 task. Overall, recommended links were evaluated at two levels (file-to-file and anchor-to-file), and system performance was evaluated with three metrics: LMAP, R-Prec and P@N.

Overview of the NTCIR-9 Crosslink Task: Cross-lingual Link Discovery

This paper presents an overview of NTCIR-9 Cross-lingual Link Discovery (Crosslink) task. The overview includes: the motivation of cross-lingual link discovery; the Crosslink task definition; the run submission specification; the assessment and evaluation framework; the evaluation metrics; and the evaluation results of submitted runs. Cross-lingual link discovery (CLLD) is a way of automaticall...

Automated Cross-lingual Link Discovery in Wikipedia

At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach that mines the existing link structure for anchor probabilities and relies on the “translation” using cross-l...

An evaluation framework for cross-lingual link discovery

Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case wit...

Publication date: 2011